Maximizing Top-down Constraints for Unification-based Systems 



Noriko Tomuro 

School of Computer Science, Telecommunications and Information Systems 

DePaul University 
Chicago, IL 60604 
cphdnt@ted.cs.depaul.edu 



Abstract 

A left-corner parsing algorithm with top- 
down filtering has been reported to show 
very efficient performance for unification- 
based systems. However, due to the non- 
termination of parsing with left-recursive 
grammars, top-down constraints must be 
weakened. In this paper, a general method 
of maximizing top-down constraints is pro- 
posed. The method provides a procedure 
to dynamically compute *restrictor*, a 
minimum set of features involved in an in- 
finite loop for every propagation path; thus 
top-down constraints are maximally prop- 
agated. 

1 Introduction 

A left-corner parsing algorithm with top-down filter- 
ing has been reported to show very efficient perfor- 
mance for unification-based systems (Carroll, 1994). 
In particular, top-down filtering seems to be very ef- 
fective in increasing parse efficiency (Shann, 1991). 
Ideally, all top-down expectations should be
propagated down to the input word so that unsuccessful
rule applications are pruned as early as possible.
However, in the context of unification-based parsing,
left-recursive grammars have the formal power of a
Turing machine, so detecting all infinite loops due
to left-recursion is impossible (Shieber, 1992).
Top-down constraints must therefore be weakened
for parsing to be guaranteed to terminate.

In order to solve the nontermination problem, 
Shieber (1985) proposes restrictor, a statically pre- 
defined set of features to consider in propagation, 
and restriction, a filtering function which removes 
the features not in restrictor from top-down
expectation. However, not only does this approach
fail to provide a method for automatically generating
the restrictor set, it may also weaken the predictive
power of top-down expectation more than necessary: a
globally defined restrictor can only specify the least
common features for all propagation paths.

In this paper, a general method of maximizing 
top-down constraints is proposed. The method 
provides a procedure to dynamically compute 
*restrictor*, a minimum set of features involved in 
an infinite loop, for every propagation path.
Features in this set are selected by the detection
function, and will be ignored in top-down propagation.
Using *restrictor*, only the relevant features
particular to the propagation path are ignored, thus
top-down constraints are maximally propagated.

2 Notation 

We use notation from the PATR-II formalism 
(Shieber, 1986) and (Shieber, 1992). Directed 
acyclic graphs (dags) are adopted as the
representation model. The symbol = is used to represent
the equality relation in the unification equations, and
the symbol · used in the form p1 · p2 represents
the path concatenation of p1 and p2.

The subsumption relation is defined as "Dag D
subsumes dag D' if D is more general than D'."
The unification of D and D' is notated by D U D'.

The extraction function D/p1 extracts the subdag
under path p1 for a given D, and the embedding
function D\p1 injects D into the enclosing dag D'
such that D'/p1 = D. The filtering function p is
similar to (Shieber, 1992): p(D) returns a copy of D
in which some features may be removed. Note that
in this paper *restrictor* specifies the features to
be removed by p, whereas in (Shieber, 1985, 1992)
restrictor specifies the features to be retained by
restriction, which is equivalent to p.
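
As a rough illustration of the three functions above, the following
Python sketch models a dag as nested dictionaries. This is an
assumption made for brevity: reentrancy (shared nodes) is ignored,
and the names extract, embed and restrict, as well as the sample
features, are hypothetical and not part of PATR-II or LINK.

```python
from copy import deepcopy

def extract(dag, path):
    # D/p: return the subdag under path, or None if the path is absent.
    node = dag
    for feature in path:
        if not isinstance(node, dict) or feature not in node:
            return None
        node = node[feature]
    return node

def embed(dag, path):
    # D\p: build an enclosing dag D' such that extract(D', path) == dag.
    enclosing = deepcopy(dag)
    for feature in reversed(path):
        enclosing = {feature: enclosing}
    return enclosing

def restrict(dag, restrictor):
    # p(D): copy of dag without any arc whose label is in restrictor.
    if not isinstance(dag, dict):
        return dag
    return {f: restrict(v, restrictor)
            for f, v in dag.items() if f not in restrictor}

# Hypothetical sample dag (tree-shaped, no shared nodes).
noun_phrase = {"head": {"sem": {"owner": {"name": "kris"}, "type": "desk"}}}
under_lc = embed(noun_phrase, ("lc",))
filtered = restrict(noun_phrase, {"owner"})
```

Here restrict follows this paper's convention, in which *restrictor*
lists the features to be removed rather than the features to be kept.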

3 Top-down Propagation 

Top-down propagation can be precomputed to form 
a reachability table. Each entry in the table is a 
compiled dag which represents the relation between
a non-terminal category and a rule used to rewrite
the constituents in the reachability relation (i.e.,
reflexive, transitive closure of the left-corner path).

For example, consider the following fragment of 
a grammar used in the syntax/semantics integrated 
system called LINK (Lytinen, 1992): 

r1 : NP0 -> NP1 POS NP2

(NP0 head) = (NP2 head)

(NP0 head sem owner) = (NP1 head sem)

(This rule is used to parse phrases such as "Kris's 
desk".) 

The dag D(1) in Figure 1 represents the initial
application of r1 to the category NP. Note that
the subdag under the lc arc is the rule used to 
rewrite the constituent on the left-corner path, and 
the paths from the top node represent which top- 
down constraints are propagated to the lower level. 

Top-down propagation works as follows: given a 
dag D that represents a reachability relation and 
a rule dag R whose left-hand side category (i.e., 
root) is the same as D's left-corner category (i.e., 
under its (lc 1) path), the resulting dag is D1 =
p(D') U (R \ lc), where D' is a copy of D in which
all the numbered arcs and the lc arc are deleted and
the subdag which used to be under the (lc 1) path
is promoted to lie under the lc arc. Dags after the
next two recursive applications of r1 (D(2) and D(3),
respectively²) are shown in Figure 1.

Notice that the filtering function p is applied only
to D'. In the case when p(D') = nil, the top node in
D1 will have no connections to the rule dag under
the lc arc. This means no top-down constraints are
propagated to the lower level, so parsing becomes
purely bottom-up.

In many unification-based systems, subsumption 
is used to avoid redundancy: a dag is recorded in 
the table if it is not subsumed by any other one. 
Therefore, if a newly created dag is incompatible
with or more general than existing dags, rule
application continues. In the above example, D(2) is
incompatible with D(1) and therefore gets entered into
the table. The owner arc keeps extending in the
subsequent recursive applications (as in D(3)), so the
propagation goes into an infinite loop.
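
The effect can be imitated with a small Python sketch. It is only a
toy under stated assumptions: dags are nested dictionaries without
reentrancy, the atom "SEM" is a placeholder for the left-corner
semantics, and the helper names (subsumes, record, owner_chain) are
hypothetical. Each dag with one more owner arc is not subsumed by
any earlier one, so the table would keep growing.

```python
def subsumes(general, specific):
    # An empty dag places no constraints, so it subsumes anything.
    if general == {}:
        return True
    if not isinstance(general, dict) or not isinstance(specific, dict):
        return general == specific
    return all(f in specific and subsumes(v, specific[f])
               for f, v in general.items())

def record(table, dag):
    # Enter dag into the table unless an existing entry already subsumes it.
    if any(subsumes(old, dag) for old in table):
        return False
    table.append(dag)
    return True

def owner_chain(n):
    # Toy stand-in for D(n): n owner arcs above a placeholder atom.
    dag = "SEM"
    for _ in range(n):
        dag = {"owner": dag}
    return {"head": {"sem": dag}}

table = []
# Only five steps are simulated; without a bound this would never stop.
recorded = [record(table, owner_chain(n)) for n in range(1, 6)]
```

Every entry of recorded is True, which mirrors the unbounded
extension of the owner arc described above.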

3.1 Proposed Method 

Let A be a dag created by the first application of
the rule R and B be a dag created by the second
application during the top-down propagation.³ In
the proposed method, A and B are first checked for
subsumption. If B is subsumed by A, the propagation
for this path terminates. Otherwise a possible
loop is detected. The detection function (described
in the next subsection) is called on A and B and
selected features are added to the *restrictor* set.⁴
Then, using the updated *restrictor*, propagation
is re-done from A.

1 Category symbols are directly indicated in the dag
nodes for simplicity.

2 In this case, p is assumed to be an identity function.

3 In the case of indirect recursion, there are some
intervening rule applications between A and B.

When R is applied again yielding B', while B'
is not subsumed by A, the following process is
repeated: if B' is incompatible with A, the detection
function is called on A and B' and propagation is
re-done from A. If B' is more general than A, then A
is replaced by B' (thereby keeping the most general
dag for the path) and propagation is re-done from
B'. Otherwise the process stops for this propagation
path. Thus, the propagation will terminate when
enough features are detected, or when *restrictor*
includes all the (finite number of) features in the
grammar.⁵

In the example, when the detection function is
called on D(1) and D(2) after the first recursive
application, the feature owner is selected and added to
*restrictor*. After the propagation is re-done from
D(1), the resulting dag D(4) becomes more general
than D(1).⁶ Then D(1) is replaced by D(4), and
the propagation is re-done once again. This time
it results in the same D(4), therefore the propagation
terminates.
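
The control flow of this subsection can be sketched in Python as
follows. This is a simplified reading, not the LINK implementation:
the first and later rule applications are handled by one loop, the
early exit when nothing new is detected is an added assumption, and
a dag is reduced to a single tuple naming the top-node path that
links to the left corner (the empty tuple means no top-down link).
The functions step, subsumes and detect are hypothetical stand-ins
for rule application with filtering, the subsumption check, and the
detection function.

```python
def propagate(first, step, subsumes, detect):
    # Grow a per-path restrictor until re-propagation stops changing the dag.
    current, restrictor = first, set()
    while True:
        new = step(current, restrictor)
        if subsumes(current, new):
            return current, restrictor   # new dag is subsumed: path is done
        if subsumes(new, current):
            current = new                # keep the most general dag
            continue
        detected = detect(current, new) - restrictor
        if not detected:                 # assumption: stop rather than spin
            return current, restrictor
        restrictor |= detected           # then redo propagation from current

def step(link, restrictor):
    # Toy version of one more application of r1: the owner arc extends
    # the link, unless a restricted feature cuts the link entirely.
    extended = link + ("owner",) if link else link
    if any(feature in restrictor for feature in extended):
        return ()
    return extended

def subsumes(general, specific):
    # In this toy, only "no link" is more general than a different link.
    return general == () or general == specific

def detect(a, b):
    # Toy detection: last arc of the longer of the two links.
    longer = a if len(a) > len(b) else b
    return {longer[-1]} if longer else set()

d1 = ("head", "sem", "owner")
final, restrictor = propagate(d1, step, subsumes, detect)
```

In this run the owner feature is detected after the first recursive
application, the link is then filtered away (the analogue of D(4)),
and the next application yields the same result, so the loop stops.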

3.2 Detection Function 

The detection function compares two dags X and Y
by checking every constraint (unification equation)
x in X against any inconsistent or more general
constraint y in Y. If such a constraint is found, the
function selects a path in x or y and detects its last
arc/feature as being involved in the possible loop.⁷

If x is the path constraint p1 = p2 where p1 and
p2 are paths of length > 1, features may be detected
in the following cases:

• (case 1) If both p1 and p2 exist in Y, and there
exists a more general constraint y in Y in the
form p1 · p3 = p2 · p1 (length of p3 is also > 1),
the path p3 is selected;

4 A separate *restrictor* must be kept for each prop- 
agation path. 

5 In reality, the category feature will never be in
*restrictor* because the same rule R is applied to derive
both A and B'.

6 Remember D(4) = p(D(1)') U (r1\lc) where p filters
out the owner arc.

7 This scheme may be rather conservative. 

8 Note the cases in this section do not represent all 
possible situations. 

Figure 1: DAGs used in the example 



• (case 2) If both p1 and p2 exist in Y, but the
subdag under p1 and the subdag under p2 do
not unify, or if neither p1 nor p2 exists in Y,
whichever of p1 or p2 does not contain the lc
arc, or either one if they both contain the lc arc,
is selected; and

• (case 3) If either p1 or p2 does not exist in Y,
the one which does not exist in Y is selected.

If x is the constant constraint p1 = c (where c is
some constant), features may be detected in the
following cases:

• (case 4) If there exists an incompatible
constraint y of the form p1 = d where d ≠ c in
Y, or if there is no path p1 in Y, p1 is selected;
and

• (case 5) If there exists an incompatible
constraint y of the form p1 · p2 = c, then p2 is
selected.
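
Cases 4 and 5 can be sketched in Python under simplifying
assumptions: Y is reduced to a dictionary from paths (tuples of
features) to constants, so path constraints and reentrancy are left
out, and a path counts as existing in Y only if some key equals or
extends it. The function name detect_constant and the sample
features are hypothetical.

```python
def detect_constant(x_path, x_value, y):
    # Return the feature detected for the constraint x_path = x_value,
    # or None when Y gives no reason to suspect a loop.
    if x_path in y:
        # case 4, first half: same path, different constant
        return x_path[-1] if y[x_path] != x_value else None
    n = len(x_path)
    extensions = [p for p in y if len(p) > n and p[:n] == x_path]
    for p in extensions:
        # case 5: Y holds p1 . p2 = c, so the last arc of p2 is detected
        if y[p] == x_value:
            return p[-1]
    if not extensions:
        # case 4, second half: there is no path p1 in Y
        return x_path[-1]
    return None

case4 = detect_constant(("head", "agr", "num"), "sg",
                        {("head", "agr", "num"): "pl"})
case5 = detect_constant(("head", "sem"), "kris",
                        {("head", "sem", "owner"): "kris"})
missing = detect_constant(("head", "sem"), "kris", {})
same = detect_constant(("head", "sem"), "kris", {("head", "sem"): "kris"})
```

In each case the function reports only the last arc of the selected
path, following the description at the start of this subsection.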

4 Related Work 

A similar solution to the nontermination problem 
with unification grammars in Prolog is proposed in 
(Samuelsson, 1993). In this method, an operation 
called anti-unification (often referred to as
generalization, as the counterpart of unification) is
applied to the root and leaf terms of a cyclic
propagation, and the resulting term is stored in the
reachability table as the result of applying restriction
on both terms. Another approach taken in (Haas, 1989)
eliminates the cyclic propagation by replacing the 
features in the root and leaf terms with new vari- 
ables. 

The method proposed in this paper is more general
than the above approaches: if a selection ordering
is imposed in the detection function, features
in *restrictor* can be collected incrementally as the
cyclic propagations are repeated. Thus, this method
is able to create a less restrictive *restrictor* than
these other approaches.

5 Discussion and Future Work 

The proposed method has an obvious difficulty:
the complexity caused by the repeated propagations
could become overwhelming for some grammars.
However, in the experiment on the LINK system
using a fairly broad grammar (over 130 rules),
precompilation terminated with only a marginally
longer processing time.

In the experiment, all features (around 40
syntactic/semantic features) except for the one in the
example in this paper could be used in propagation.
In the preliminary analysis, the number of edges
entered into the chart decreased by 30% compared
to when only the category feature (i.e., context-free
backbone) was used in propagation.

For future work, we intend to apply the proposed
method to other grammars. An empirical analysis of
precompilation and parse efficiency for different
grammars will allow us to assess the practical
applicability of the proposed method. We also intend
to do a more exhaustive case analysis and to
investigate the selection ordering of the detection
function. Although the current definition covers
most cases, it is by no means complete.